All Questions

0 votes
0 answers
271 views

Correct method to report Randomized Search CV results

I have searched online but I still cannot find a definitive answer on how to "correctly" report the results from hyperparameter tuning a machine learning model; though, this may just be some ...
user167433
1 vote
2 answers
257 views

Why do we need hyperparameter tuning in scikit-learn? Don't scikit-learn models give the best model by default?

When I have the option to build a classifier directly like this: clf = RandomForestClassifier(), why do we perform tuning by restricting the parameters like this <...
Hola
0 votes
0 answers
214 views

SVM taking too much time to train

I'm trying to train my ML model with svm.SVC from sklearn, but it takes so long that training does not finish even once. This happens only when a kernel function is used. Currently I have selected 10 features ...
OctoCat
1 vote
1 answer
72 views

The sklearn train_test_split function creates training data and test data that are not similar

I am working on loan default data and my model is not able to make accurate predictions on the test set because the default percentage on the test set is very different from that of the training ...
J.Sriram
0 votes
2 answers
76 views

Which random_state to use in train_test_split when deploying the final model?

I have developed a Random Forest that gives varying results depending on the random state of the train/test split. This is normal, because a lot of the values in the data are extreme, without being ...
Nemo_the_scientist
1 vote
2 answers
375 views

How to remove test set so that model uses all data as training data?

I have developed a RandomForest classification model and I am pretty satisfied with the results on the test set. Now, my next step is to deploy the model. Before ...
Nemo_the_scientist
4 votes
2 answers
2k views

Flipping the labels in a binary classification gives different model and results

I have an imbalanced dataset and I want to train a binary classifier to model the dataset. Here was my approach, which resulted in (relatively) acceptable performance: 1- I made a random split to get ...
Farzad
0 votes
0 answers
52 views

How to improve validation score

I am working on time series classification. My data has 4 classes. I used this paper's architecture on my data (1611.06455). However, my results look like this: [image]. Here is a link to my notebook I ...
Ayan Mitra
1 vote
1 answer
193 views

Feature Engineer each class separately in Binary Classification

I have an imbalanced tabular dataset, and my problem is binary classification. The dataset is strongly imbalanced, so I have performed oversampling, but it did not solve the issue. You can find the ...
bechirjamoussi
1 vote
0 answers
299 views

Low F1-Score due to Imbalanced Dataset even after resampling

I am performing binary classification on an imbalanced dataset (class 0: 16,263; class 1: 214). I have used multiple oversampling, undersampling, and combination techniques; below are the results that I have ...
bechirjamoussi
1 vote
0 answers
26 views

Label Encoding for Target Classes: Any Integer or Consecutive Integers from Zero?

I'm handling a very conventional supervised classification task with three (mutually exclusive) target categories (not ordinal ones): ...
Hendrik
1 vote
0 answers
21 views

Can I "fit" a k-nearest neighbors classifier without precomputing anything?

I am currently trying to fit a KNeighborsClassifier (scikit-learn implementation) to about a gigabyte of training data. From every resource I've read online, a k-nearest-neighbors classifier is a ...
Reggie Simmons
0 votes
0 answers
325 views

LightGBM predict_proba in thousandths place

Can someone explain to me why my LightGBM classification model's predict_proba() output is in the thousandths place for the positive class: ...
Tinkinc
0 votes
1 answer
721 views

Which classification_report metrics are appropriate to report/interpret for a binary label? Individual or macro average for both classes? scikit-learn

First, please forgive my ignorance; I am a newbie but dedicated to learning more. Example: I am using a random forest classifier to predict a binary outcome. The binary outcome equals 1 if people ...
iamtheonewhoknocks
1 vote
1 answer
251 views

Imbalanced classification task – Discrepancy between learning curves and test set evaluation

I have a binary classification task related to customer churn for a bank. The dataset contains 10,000 instances and 11 features. The target variable is imbalanced (80% remained as customers (0), 20% ...
KK_o7
